Abstract. In this report we consider the problem of estimating the vectors of
location parameters in the multivariate one sample and two sample problems.
These estimators are obtained through the use of multivariate rank order
statistics, such as the Wilcoxon or the normal scores statistic, considered by
the authors in [13] and [18] for the corresponding testing problems. The
distribution of these estimators is shown to be symmetric with respect to the
parameters being estimated. These estimators are translation invariant, robust
and asymptotically normal. Their asymptotic relative efficiencies with respect
to the estimators based on the vector of means and medians are discussed by
applying the criterion of Wilks' generalized variance (cf. [1], p. 166). In
particular, it is shown that the estimators based on the multivariate normal
scores statistics are asymptotically as efficient as the ones based on the
method of least squares when the parent distributions are normal.

1. Introduction. Let us first consider the multivariate one sample problem.
Here we have N independent and identically distributed vector valued random
variables $Z_\alpha = (Z_{1\alpha},\ldots,Z_{p\alpha})$, $\alpha=1,\ldots,N$,
distributed according to an absolutely continuous p-variate cdf (cumulative
distribution function) $F(z-\theta)$, where $z = (z_1,\ldots,z_p)$ and
$\theta = (\theta_1,\ldots,\theta_p)$. We assume that F(x) is diagonally
symmetric about 0, the null vector, that is, the density f(x) of F(x) is
invariant under simultaneous changes of signs of all coordinate variates. Two
problems often studied in this set up are: (i) to test the hypothesis
$\theta = 0$ versus $\theta \neq 0$, and (ii) to obtain a point estimator of
the shift $\theta$. The classical approach to the testing problem is based on
the Hotelling $T^2$-statistic $T^2 = N \bar Z_N S^{-1} \bar Z_N'$, which has
(when $\theta = 0$) a known distribution converging asymptotically to the chi
square distribution with p degrees of freedom under certain general
assumptions on F(x) (cf. [1]). Here

$S = \sum_{\alpha=1}^{N} (Z_\alpha - \bar Z_N)'(Z_\alpha - \bar Z_N)/(N-1)$

is the covariance matrix of the sample, and
$\bar Z_N = (\bar Z_{N1},\ldots,\bar Z_{Np})$,
$\bar Z_{Nj} = \sum_{\alpha=1}^{N} Z_{j\alpha}/N$. Similarly, the classical
point estimator of $\theta$ is $\bar Z_N$.

We now consider the multivariate two sample problem. Here we have
$X_\alpha = (X_{1\alpha},\ldots,X_{p\alpha})$, $\alpha=1,\ldots,m$, and
$Y_\beta = (Y_{1\beta},\ldots,Y_{p\beta})$, $\beta=1,\ldots,n$, independent
samples of sizes m and n = N-m from the p-variate absolutely continuous cdfs
F(x) and $F(x-\Delta)$ respectively, where $x = (x_1,\ldots,x_p)$ and
$\Delta = (\Delta_1,\ldots,\Delta_p)$. Then, analogous to the one sample
problem, the two problems often studied in this set up are: (i) to test the
hypothesis $\Delta = 0$ against $\Delta \neq 0$, and (ii) to estimate the
shift $\Delta$. The classical approach to the testing problem is based on
Hotelling's statistic
$T^2 = \frac{mn}{N}(\bar Y_N - \bar X_N) S^{-1} (\bar Y_N - \bar X_N)'$, which
has (when $\Delta = 0$) asymptotically the chi square distribution with p
degrees of freedom under certain assumptions on F(x). Here
$\bar X_N = (\bar X_{N1},\ldots,\bar X_{Np})$,
$\bar Y_N = (\bar Y_{N1},\ldots,\bar Y_{Np})$,
$\bar X_{Nj} = \sum_{\alpha=1}^{m} X_{j\alpha}/m$,
$\bar Y_{Nj} = \sum_{\beta=1}^{n} Y_{j\beta}/n$, and S is the pooled within
sample covariance matrix. Similarly, the classical estimator of $\Delta$ is
$\bar Y_N - \bar X_N$.

These methods are, however, known to be vulnerable to gross errors. For
testing problems, this difficulty was overcome successfully by the present
authors among others (cf. [13] and [18] and the references cited therein). In
[13] and [18], the authors considered multivariate rank tests such as the
Wilcoxon and the normal scores tests, and established their superiority in
robustness over the tests based on Hotelling's $T^2$-statistics. This paper
examines the properties of the estimators of $\theta$ and $\Delta$ obtained
through the use of the rank order statistics considered in [13] and [18].

Interest in providing robust estimators through the use of rank order tests
was initiated by the work of Hodges and Lehmann [7] and Sen [17]. They
considered the problems of estimating location parameters in the corresponding
univariate problems and proved that the asymptotic relative efficiency of
their estimators relative to the classical estimators is the same as the
Pitman efficiency of the rank tests (on which their estimators are based)
relative to the corresponding t-tests. The work in this direction has since
been continued by various workers. For the details of references the reader is
referred to

the  papers  of  Birnbaum  and  Laska  [J>,^]    and  Sen  [17]. 

2. Point Estimates. Suppose $Z_\alpha = (Z_{1\alpha},\ldots,Z_{p\alpha})$,
$\alpha=1,\ldots,N$, is an independent sample of size N from a p-variate cdf
(cumulative distribution function)
$F(x-\theta) = F(x_1-\theta_1,\ldots,x_p-\theta_p)$, where F(x) is absolutely
continuous and is diagonally symmetric about 0, that is, the density f(x) of
F(x) remains invariant under simultaneous changes of signs of all the
coordinate variates. The estimator of $\theta$ which is most commonly used is
the mean estimator

(2.1)  $\bar Z_N = (\bar Z_{N1},\ldots,\bar Z_{Np})$,  $\bar Z_{Nj} = \sum_{\alpha=1}^{N} Z_{j\alpha}/N$,

and it is well known that this estimator is minimum variance unbiased if F
is normal.

The object of this investigation is to consider some nonparametric
estimators of $\theta$, and compare their performances with that of
$\bar Z_N$. These estimators are based on a class of Chernoff-Savage [3] rank
order statistics

(2.2)  $h^*_N = (h^*_{N1},\ldots,h^*_{Np})$

where

(2.3)  $h^*_{Nj} = \frac{1}{N}\sum_{\alpha=1}^{N} E^{(j)}_{N\alpha} C^{(j)}_{N\alpha}$,

$C^{(j)}_{N\alpha} = 1$ if the $\alpha$-th smallest observation among
$|Z_{j1}|,\ldots,|Z_{jN}|$ is from a positive Z, and $C^{(j)}_{N\alpha} = 0$
otherwise, j=1,...,p, and $E^{(j)}_{N\alpha} = J^*_{Nj}(\alpha/(N+1))$ is the
expected value of the $\alpha$-th order statistic of a sample of size N from a
distribution $\Psi^*_j(x)$ given by

(2.4)  $\Psi^*_j(x) = \Psi_j(x) - \Psi_j(-x)$ if $x \ge 0$, and 0 otherwise.

The function $J^*_{Nj}$ is defined only at 1/(N+1),...,N/(N+1), but we may
extend its domain of definition to (0,1) by letting it have constant value
over $[\alpha/(N+1), (\alpha+1)/(N+1))$. Furthermore, we make the following
assumptions:


Assumption I. $\Psi_j(x)$ is symmetric about x=0, that is,
$\Psi_j(x) + \Psi_j(-x) = 1$, j=1,...,p.

Assumption II. $J^*_{Nj}(u) \to J^*_j(u) = \Psi^{*-1}_j(u)$ for $0 < u < 1$,
and $J^*_j$ is not constant.


N 

I 
a=l 


Assumption   III.      ^     ll^^-^f'^^    ^^    =  oil^S  . 


Assumption IV. $J_j(u) = \Psi^{-1}_j(u)$ is absolutely continuous and

$|J^{(i)}_j(u)| = |d^i J_j(u)/du^i| \le K[u(1-u)]^{-i-\frac12+\delta}$,  i=0,1,2,

for some K and some $\delta > 0$.

The statistics given by (2.2) and (2.3) and satisfying the assumptions I,
II, III and IV include as special cases a number of well known test statistics.
The two important ones often studied in the nonparametric literature are the
Wilcoxon signed-rank statistics and the normal scores statistics, obtained
by taking for $\Psi_j(x)$ the rectangular distribution over (-1,1) and the
standard normal distribution respectively.
Now by definition, $h^*_{Nj}(Z_{j1}-a_j,\ldots,Z_{jN}-a_j)$ is nonincreasing
in $a_j$ for each j=1,...,p. Furthermore, the variables
$Z_{j1}-\theta_j,\ldots,Z_{jN}-\theta_j$ are independent and identically
distributed according to the marginal cdf $F_j(x)$, which is symmetric about
x=0. It follows that the distribution of
$h^*_{Nj}(Z_{j1}-\theta_j,\ldots,Z_{jN}-\theta_j)$ is symmetric about a known
value $\mu^*_{Nj}$. [It is easy to verify that
$\mu^*_{Nj} = \frac{1}{2N}\sum_{\alpha=1}^{N} E^{(j)}_{N\alpha}$, which equals
$1/4$ or $1/\sqrt{2\pi}$ according as $h^*_{Nj}$ is the Wilcoxon signed rank
statistic or the normal scores statistic respectively.] Let for each j=1,...,p

(2.5)  $\theta^*_j = \sup\{a_j : h^*_{Nj}(Z_{j1}-a_j,\ldots,Z_{jN}-a_j) > \mu^*_{Nj}\}$

and

(2.6)  $\theta^{**}_j = \inf\{a_j : h^*_{Nj}(Z_{j1}-a_j,\ldots,Z_{jN}-a_j) < \mu^*_{Nj}\}$.

As a natural extension of [9], we propose

(2.7)  $\hat\theta_N = (\hat\theta_{N1},\ldots,\hat\theta_{Np})$,  $\hat\theta_{Nj} = (\theta^*_j + \theta^{**}_j)/2$,  j=1,...,p,

as the estimator of the location parameter $\theta$.

Proceeding precisely as in [7] and [16], it follows that $\hat\theta_N$ is a
translation invariant estimator of $\theta$, its distribution is diagonally
symmetric about $\theta$, and is absolutely continuous. The estimator
$\hat\theta_N$ forms a general class of estimators of $\theta$. A few
important members of this class are as follows: (i) the Wilcoxon estimator
$\hat\theta_{N(R)}$, resulting from the Wilcoxon signed rank statistics by
taking for $\Psi^*_j(x)$ the rectangular distribution over (0,1). In this case
the estimator $\hat\theta_{Nj(R)}$ of $\theta_j$ is the median of the variables
$W_j^{(1)},\ldots,W_j^{(k)}$, where $W_j^{(1)} < \cdots < W_j^{(k)}$ are the
k=N(N+1)/2 ordered values of the averages $(Z_{j\alpha}+Z_{j\beta})/2$ for
$1 \le \alpha \le \beta \le N$, j=1,...,p (cf. [7]). Also refer to an
interesting paper of Moses [15] which contains a graphical procedure for
computing $\hat\theta_{Nj(R)}$ based on the Wilcoxon statistic. (ii) the normal
scores estimator $\hat\theta_{N(\Phi)}$, resulting from the normal scores
statistics by taking for $\Psi_j(x)$ the standard normal distribution. The
estimator $\hat\theta_{N(\Phi)}$ cannot, however, be expressed as a simple
function of the Z's. Usually an iterative procedure is employed to compute
$\hat\theta_{N(\Phi)}$. This procedure is illustrated in the appendix.
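The description of the Wilcoxon estimator in (i) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the data are the six trivariate observations tabulated in the appendix. Each coordinate is handled separately by taking the median of the N(N+1)/2 pairwise averages.

```python
from itertools import combinations_with_replacement
from statistics import median


def walsh_median(values):
    """Median of the N(N+1)/2 averages (z_a + z_b)/2 taken over a <= b."""
    averages = [(a + b) / 2 for a, b in combinations_with_replacement(values, 2)]
    return median(averages)


def wilcoxon_location_estimate(sample):
    """Apply walsh_median coordinate by coordinate to a list of p-vectors."""
    return [walsh_median(column) for column in zip(*sample)]


# Observations (X, Y, Z) for the six patients listed in the appendix.
patients = [(21, 1.3, 1.8), (27, 0.7, 3.0), (18, 1.9, 0.9),
            (18, 1.3, 1.5), (15, 0.7, 1.8), (-3, 0.2, 0.5)]
estimate = wilcoxon_location_estimate(patients)
print([round(v, 2) for v in estimate])
```

For these data each coordinate has 21 pairwise averages, and the result agrees with the Wilcoxon-scores estimate (18, 1.00, 1.65) quoted in the appendix.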

We now consider the two sample problem. Let
$X_\alpha = (X_{1\alpha},\ldots,X_{p\alpha})$, $\alpha=1,\ldots,m$, and
$Y_\beta = (Y_{1\beta},\ldots,Y_{p\beta})$, $\beta=1,\ldots,n$, be two
independent random samples of sizes m and n from the p-variate absolutely
continuous cdf's F(x) and $F(x-\Delta)$ respectively, where
$x = (x_1,\ldots,x_p)$ and $\Delta = (\Delta_1,\ldots,\Delta_p)$. Analogous to
the one sample problem, we shall consider the estimators of $\Delta$ based on
the two sample rank tests studied in [13]. For each coordinate, consider the
Chernoff-Savage [5] statistic

(2.8)  $h_{Nj} = h_{Nj}(X_{j1},\ldots,X_{jm},Y_{j1},\ldots,Y_{jn}) = \sum_{\alpha=1}^{N} E^{(j)}_{N\alpha} Z^{(j)}_{N\alpha}/m$

where $Z^{(j)}_{N\alpha} = 1$ if the $\alpha$-th smallest observation from the
combined sample $(X_{j1},\ldots,X_{jm},Y_{j1},\ldots,Y_{jn})$ is an X
observation and $Z^{(j)}_{N\alpha} = 0$ otherwise, and
$E^{(j)}_{N\alpha} = J_{Nj}(\alpha/(N+1))$ is the expected value of the
$\alpha$-th order statistic of a sample of size N=m+n from some known
distribution $\Psi_j(x)$ which satisfies the assumptions II, III and IV. Let
for each j=1,...,p

(2.9)  $\Delta^*_j = \sup\{\Delta_j : h_{Nj}(X_{j1},\ldots,X_{jm}; Y_{j1}-\Delta_j,\ldots,Y_{jn}-\Delta_j) > \mu_{Nj}\}$

and

(2.10)  $\Delta^{**}_j = \inf\{\Delta_j : h_{Nj}(X_{j1},\ldots,X_{jm}; Y_{j1}-\Delta_j,\ldots,Y_{jn}-\Delta_j) < \mu_{Nj}\}$


where $\mu_{Nj}$ is the point of symmetry of the distribution of $h_{Nj}$ when
$\Delta_j = 0$, j=1,...,p. Then, as a natural extension of Hodges and Lehmann
[7] and Sen [17], who considered the corresponding univariate case (that is,
the case p=1), we propose

(2.11)  $\hat\Delta_N = (\hat\Delta_{N1},\ldots,\hat\Delta_{Np})$,  $\hat\Delta_{Nj} = (\Delta^*_j + \Delta^{**}_j)/2$,  j=1,...,p,

as the estimator of $\Delta$ for suitable functions

(2.12)  $h_N = (h_{N1},\ldots,h_{Np})$.

Here also the estimator $\hat\Delta_N$ forms a general class of estimates.
Important members of this class are the Wilcoxon estimator
$\hat\Delta_{N(R)}$ and the normal scores estimator $\hat\Delta_{N(\Phi)}$,
resulting from (2.11) by taking for $\Psi_j(x)$ the rectangular distribution
over (0,1) and the standard normal distribution respectively. The Wilcoxon
estimator $\hat\Delta_{N(R)}$ turns out to be the vector of medians of the
set of mn differences $(Y_{j\beta}-X_{j\alpha})$, $\alpha=1,\ldots,m$,
$\beta=1,\ldots,n$; j=1,...,p ([7]). However, the normal scores estimator
$\hat\Delta_{N(\Phi)}$ has to be computed by the trial and error method as in
the one sample problem. Proceeding precisely as in [7], it can be shown that
the estimator $\hat\Delta_N$ is translation invariant and is absolutely
continuous. Furthermore, as in the univariate case [7], the distribution of
$\hat\Delta_N$ is diagonally symmetric about $\Delta$ if either

(2.13)  F(x) is diagonally symmetric about its median vector

and

(2.14)  $h_j(X_{j1},\ldots,X_{jm};Y_{j1},\ldots,Y_{jn}) + h_j(-X_{j1},\ldots,-X_{jm};-Y_{j1},\ldots,-Y_{jn}) = 2\mu_{Nj}$

or

(2.15)  $h_j(X_{j1},\ldots,X_{jm};Y_{j1},\ldots,Y_{jn}) + h_j(Y_{j1},\ldots,Y_{jn};X_{j1},\ldots,X_{jm}) = 2\mu_{Nj}$

and the sample sizes m and n are equal.

It is easy to verify that both the normal scores and the Wilcoxon estimators
are diagonally symmetric about $\Delta$.
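The two sample Wilcoxon estimator described above, the coordinatewise median of the mn differences $Y_{j\beta}-X_{j\alpha}$, can be sketched as follows. The function names and the small bivariate data set are hypothetical and serve only to illustrate the computation and the translation invariance property.

```python
from statistics import median


def shift_estimate(xs, ys):
    """Median of the m*n differences y - x for a single coordinate."""
    return median(y - x for x in xs for y in ys)


def wilcoxon_shift_estimate(x_sample, y_sample):
    """Coordinatewise shift estimate for two lists of p-vectors."""
    x_columns = zip(*x_sample)
    y_columns = zip(*y_sample)
    return [shift_estimate(xc, yc) for xc, yc in zip(x_columns, y_columns)]


# Hypothetical bivariate samples with m = 3 and n = 4.
x_sample = [(1.0, 10.0), (2.0, 12.0), (4.0, 11.0)]
y_sample = [(3.0, 15.0), (5.0, 13.0), (6.0, 16.0), (9.0, 14.0)]
delta_hat = wilcoxon_shift_estimate(x_sample, y_sample)

# Translating every Y by (2.5, -1.0) translates the estimate by the same vector.
shifted = [(a + 2.5, b - 1.0) for a, b in y_sample]
delta_shifted = wilcoxon_shift_estimate(x_sample, shifted)
print(delta_hat, delta_shifted)
```

Because only the differences enter the computation, adding a constant vector to every Y observation moves the estimate by exactly that vector.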


3. Asymptotic Normality. In the preceding section we mentioned the small
sample properties of the estimators. In the remaining sections, we shall be
concerned with their large sample behavior. To this end we introduce the
following notation.

We denote the marginal cdf of $Z_{j\alpha}$ (or $Y_{j\beta}$) by $F_j(x)$, and
that of $(Z_{j\alpha},Z_{k\alpha})$ (or $(Y_{j\beta},Y_{k\beta})$) by
$F_{j,k}(x,y)$. Since F(x) is absolutely continuous, so are $F_j(x)$ and
$F_{j,k}(x,y)$. Next, denote

(3.1)  $d_{jk} = \int_0^1 J_j^2(x)\,dx - \left(\int_0^1 J_j(x)\,dx\right)^2$  if j=k=1,...,p,
       $d_{jk} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} J_j[F_j(x)]J_k[F_k(y)]\,dF_{j,k}(x,y) - \int_0^1 J_j(x)\,dx \int_0^1 J_k(y)\,dy$  if $j \neq k = 1,\ldots,p$,

(3.2)  $c_j = \int_{-\infty}^{\infty} [dJ_j[F_j(x)]/dx]\,dF_j(x)$;  $J_j = \Psi_j^{-1}$,

(3.3)  $f_{jk} = \frac14 \int_0^1 J_j^{*2}(x)\,dx$  if j=k=1,...,p,  $J^*_j = \Psi_j^{*-1}$,
       $f_{jk} = \frac14 \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} J_j[F_j(x)]J_k[F_k(y)]\,dF_{j,k}(x,y)$  if $j \neq k = 1,\ldots,p$.

To prove the asymptotic normality of $\hat\theta_N$ and $\hat\Delta_N$ we make
use of the following theorems, the proofs of which are immediate consequences
of the more general results in [18, Theorem 4.1] and [13, Theorem 6.1]
respectively.

Theorem 3.1. Let $a = (a_1,\ldots,a_p)$ be a vector of constants and let
$h^*_N$ be given by (2.2) and (2.3) with $\Psi_j$ satisfying the assumptions
I, II, III and IV of section 2. Then the random vector

$N^{1/2}\{h^*_{Nj}(Z_{j1}-\theta_j+a_jN^{-1/2},\ldots,Z_{jN}-\theta_j+a_jN^{-1/2}) - \mu^*_{Nj},\ j=1,\ldots,p\}$

has asymptotically a p-variate normal distribution with mean vector
$\frac12(a_1c_1,\ldots,a_pc_p)$ and covariance matrix $F = (f_{jk})$ defined
by (3.3).

Theorem 3.2. Let $a = (a_1,\ldots,a_p)$ be a vector of constants and let
$h_N$ be given by (2.12) and (2.8) with $\Psi_j$ satisfying the assumptions
II, III and IV of section 2. Then if $m/N \to \lambda$ as $N \to \infty$ with
$0 < \lambda < 1$, the random vector

$N^{1/2}\{h_{Nj}(X_{j1},\ldots,X_{jm},Y_{j1}-\Delta_j-a_jN^{-1/2},\ldots,Y_{jn}-\Delta_j-a_jN^{-1/2}) - \mu_{Nj},\ j=1,\ldots,p\}$

has asymptotically a p-variate normal distribution with mean vector
$(1-\lambda)(a_1c_1,\ldots,a_pc_p)$ and covariance matrix $D = (d_{jk})$
defined by (3.1).

The following theorem establishes the asymptotic normality of $\hat\theta_N$
and $\hat\Delta_N$.


Theorem 3.3. (a) Under the assumptions of Theorem 3.1,
$N^{1/2}(\hat\theta_N-\theta)$ has asymptotically a p-variate normal
distribution with mean vector 0 and dispersion matrix
$\Lambda = (\lambda_{jk})$ where

(3.4)  $\lambda_{jk} = 4f_{jk}/c_jc_k$,  j,k=1,...,p.

(b) Under the assumptions of Theorem 3.2, $N^{1/2}(\hat\Delta_N-\Delta)$ has
asymptotically a p-variate normal distribution with mean vector 0 and
dispersion matrix $T = (\tau_{jk})$ where

(3.5)  $\tau_{jk} = d_{jk}/[\lambda(1-\lambda)c_jc_k]$,  j,k=1,...,p.

Proof. For two p-vectors x and y, let $x \le y$ denote the coordinatewise
inequalities $x_j \le y_j$ for all j=1,...,p. Then by the definition of
$\theta^*$ and $\theta^{**}$

Since $\theta^*$ and $\theta^{**}$ have absolutely continuous cdfs, it follows
that
and

(3.6) and (3.7) along with the definition of $\hat\theta_N$ imply

(3.8)     nk%i-^r"-,l^-^)<}tp    =  P[V^]  ^  P[;i(^i-Ai ^N-Ai><-ldN^' 

From  (3.8),  it  follows  that 

The  rest  of  the  proof  follows  as  an  application  of  Theorem  3.1.   The  proof 
of  part  (b)  is  completely  analogous  and  is  therefore  omitted. 

Special Cases. A. Let $J_j$ be the inverse of the standard normal
distribution, and $J^*_j$ be the inverse of the chi-distribution with one
degree of freedom. Then the estimates $\hat\Delta_N$ and $\hat\theta_N$ reduce
to the normal scores type estimates $\hat\Delta_{N(\Phi)}$ and
$\hat\theta_{N(\Phi)}$ respectively. In this case the covariance matrices
$T = (\tau_{jk})$ and $\Lambda = (\lambda_{jk})$ reduce to
$T^\Phi = (\tau^\Phi_{jk})$ and $\Lambda^\Phi = (\lambda^\Phi_{jk})$
respectively, where

(3.10)  $\tau^\Phi_{jk} = \gamma_{jk}(F,\Phi)/[\lambda(1-\lambda)B_j(F,\Phi)B_k(F,\Phi)]$,  j,k=1,...,p,

and

(3.11)  $\lambda^\Phi_{jk} = \lambda(1-\lambda)\tau^\Phi_{jk}$;  j,k=1,...,p,

where

(3.12)  $\gamma_{jk}(F,\Phi) = 1$  if j=k,
        $\gamma_{jk}(F,\Phi) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \Phi^{-1}[F_j(x)]\Phi^{-1}[F_k(y)]\,dF_{j,k}(x,y)$  if $j \neq k$,

(3.13)  $B_j(F,\Phi) = \int_{-\infty}^{\infty} \frac{f_j^2(x)\,dx}{\phi[\Phi^{-1}(F_j(x))]}$,  j=1,...,p,

where $\phi(\cdot)$ is the density of the standard normal cumulative
distribution function $\Phi(\cdot)$.

B. Let $J_j$ or $J^*_j$ be the inverse of the rectangular distribution over
(0,1). Then the estimates $\hat\Delta_N$ and $\hat\theta_N$ reduce to the
Wilcoxon estimators $\hat\Delta_{N(R)}$ and $\hat\theta_{N(R)}$ respectively.
In this case the covariance matrices $T = (\tau_{jk})$ and
$\Lambda = (\lambda_{jk})$ reduce to $T^R = (\tau^R_{jk})$ and
$\Lambda^R = (\lambda^R_{jk})$ respectively, where

(3.14)  $\tau^R_{jk} = [12\lambda(1-\lambda)B_j^2(F,R)]^{-1}$  if j=k,
        $\tau^R_{jk} = \left[\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} F_j(x)F_k(y)\,dF_{j,k}(x,y) - \frac14\right][\lambda(1-\lambda)B_j(F,R)B_k(F,R)]^{-1}$  if $j \neq k$,

and

(3.15)  $\lambda^R_{jk} = \lambda(1-\lambda)\tau^R_{jk}$;  j,k=1,...,p,

where

(3.16)  $B_j(F,R) = \int_{-\infty}^{\infty} f_j^2(x)\,dx$;  j=1,...,p.


In the next section we shall investigate the performance of the estimators
$\hat\theta_N$ (and $\hat\Delta_N$) with respect to the mean estimators
$\bar Z_N$ (and $\bar Y_N-\bar X_N$) and the median estimators
$\hat\theta_{N(M)}$ (and
$\hat\Delta_{N(M)} = (\hat\Delta_{N1(M)},\ldots,\hat\Delta_{Np(M)})$), where

(3.17)  $\hat\Delta_{Nj(M)} = \mathrm{median}(Y_{j1},\ldots,Y_{jn}) - \mathrm{median}(X_{j1},\ldots,X_{jm})$,  j=1,...,p.

To this end we shall need the following well known results.

Theorem 3.4. (a) Suppose that the marginal cdf $F_j(x)$ is absolutely
continuous for each j=1,...,p, and its derivative at 0, denoted by $f_j(0)$,
exists and is finite. Then

(3.18)  $\lim_{N\to\infty} P[N^{1/2}(\hat\theta_{N(M)}-\theta) \le u] = \Phi_{[0,\Sigma^*]}(u_1,\ldots,u_p)$

where $\Phi_{[\mu,\Sigma^*]}$ is the cdf of the p-variate normal distribution
with mean vector $\mu$ and covariance matrix $\Sigma^* = (\sigma^*_{jk})$
given by

(3.19)  $\sigma^*_{jk} = [4f_j^2(0)]^{-1}$  if j=k,
        $\sigma^*_{jk} = [P_0(Z_{j\alpha} \le 0, Z_{k\alpha} \le 0) - \frac14]/[f_j(0)f_k(0)]$  if $j \neq k$,

and

(3.20)  $\lim_{N\to\infty} P[N^{1/2}(\hat\Delta_{N(M)}-\Delta) \le u] = \Phi_{[0,\Sigma^*/\lambda(1-\lambda)]}(u_1,\ldots,u_p)$.

(b) Under the assumption that the variance of the marginal cdf $F_j(x)$
exists and is finite for each j=1,...,p,

(3.21)  $\lim_{N\to\infty} P[N^{1/2}(\bar Z_N-\theta) \le u] = \Phi_{[0,\Sigma]}(u_1,\ldots,u_p)$

where $\Sigma = (\sigma_{jk})$ and
$\sigma_{jk} = \mathrm{cov}(Z_{j\alpha},Z_{k\alpha})$ when $\theta = 0$, and

(3.22)  $\lim_{N\to\infty} P[N^{1/2}(\bar Y_N-\bar X_N-\Delta) \le u] = \Phi_{[0,\Sigma/\lambda(1-\lambda)]}(u_1,\ldots,u_p)$

where $\Sigma = (\sigma_{jk})$ and
$\sigma_{jk} = \mathrm{cov}(X_{j\alpha},X_{k\alpha})$ when $\Delta = 0$.

Remark. It may be noted that the proof of Theorem 3.3 is not applicable to
derive the limiting distribution of $\hat\theta_{N(M)}$ or
$\hat\Delta_{N(M)}$, since Theorems 3.1 and 3.2 do not apply to the sign
statistics.


4. Asymptotic Relative Efficiency. To obtain an idea of the relative
performance of one estimator with respect to another, we employ the notion of
"the generalized variance", a concept introduced by Wilks (see [1], p. 166).
The "generalized variance" of a p-variate random vector
$X = (X_1,\ldots,X_p)$ with non-singular covariance matrix
$\Sigma = (\rho_{jk}\sigma_j\sigma_k)$ is defined to be
$\mathrm{var}\,X = \sigma_1^2\cdots\sigma_p^2 \det\|\rho_{jk}\|$, where "det"
denotes the determinant.

Suppose that the two asymptotically unbiased estimators T and T' of $\theta$,
with asymptotically non-singular covariance matrices
$\Sigma^T = (\rho^T_{jk}\sigma^T_j\sigma^T_k)$ and
$\Sigma^{T'} = (\rho^{T'}_{jk}\sigma^{T'}_j\sigma^{T'}_k)$, require N and N'
observations to achieve equal asymptotic generalized variances. Then the
asymptotic relative efficiency of T with respect to T' is defined as

(4.1)  $e_{T,T'} = \lim_{N\to\infty} \frac{N'}{N} = \left[\frac{(\sigma^{T'}_1\cdots\sigma^{T'}_p)^2 \det\|\rho^{T'}_{jk}\|}{(\sigma^T_1\cdots\sigma^T_p)^2 \det\|\rho^T_{jk}\|}\right]^{1/p}$.
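The generalized variance criterion amounts to comparing determinants of asymptotic covariance matrices and taking a p-th root. The following Python sketch illustrates that computation; the helper names and the two 2x2 matrices are hypothetical, and the efficiency is read here as the p-th root of the ratio of the generalized variance of T' to that of T.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[float(v) for v in row] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def relative_efficiency(cov_t, cov_t_prime):
    """Efficiency of T relative to T': (det cov_T' / det cov_T) ** (1/p)."""
    p = len(cov_t)
    return (determinant(cov_t_prime) / determinant(cov_t)) ** (1.0 / p)


# Hypothetical asymptotic covariance matrices of two bivariate estimators.
cov_t = [[1.0, 0.3], [0.3, 1.0]]
cov_t_prime = [[1.2, 0.3], [0.3, 1.5]]
efficiency = relative_efficiency(cov_t, cov_t_prime)
print(efficiency)
```

A value above 1 indicates that T needs fewer observations than T' to reach the same asymptotic generalized variance.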
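Now from (3.4), (3.5), (3.18), (3.20), (3.21) and (3.22) the generalized
variances of $\hat\theta_N$, $\hat\Delta_N$, $\hat\theta_{N(M)}$,
$\hat\Delta_{N(M)}$, $\bar Z_N$ and $\bar Y_N-\bar X_N$ are given by

(4.2)  $\mathrm{var}[N^{1/2}\hat\theta_N] = \left(\prod_{j=1}^{p} 4f_{jj}/c_j^2\right)\det\|\rho^{*J}_{jk}\|$

where

(4.3)  $\rho^{*J}_{jk} = f_{jk}/(f_{jj}f_{kk})^{1/2}$,  j,k=1,...,p,

(4.4)  $\mathrm{var}[N^{1/2}\hat\Delta_N] = \left(\prod_{j=1}^{p} \frac{d_{jj}}{\lambda(1-\lambda)c_j^2}\right)\det\|\rho^J_{jk}\|$

where

(4.5)  $\rho^J_{jk} = d_{jk}/(d_{jj}d_{kk})^{1/2}$,  j,k=1,...,p,

and

(4.6)  $\mathrm{var}[N^{1/2}\hat\theta_{N(M)}] = \left(\prod_{j=1}^{p} \frac{1}{4f_j^2(0)}\right)\det\|\rho^M_{jk}\| = [\lambda(1-\lambda)]^p\,\mathrm{var}[N^{1/2}\hat\Delta_{N(M)}]$

where

(4.7)  $\rho^M_{jk} = 4P_0(Z_{j\alpha} \le 0, Z_{k\alpha} \le 0) - 1$  if $j \neq k$,
       $\rho^M_{jk} = 1$  if j=k,

(4.8)  $\mathrm{var}[N^{1/2}\bar Z_N] = [\lambda(1-\lambda)]^p\,\mathrm{var}[N^{1/2}(\bar Y_N-\bar X_N)] = \left(\prod_{j=1}^{p}\sigma_j^2\right)\det\|\rho_{jk}\|$.

Hence denoting $e^{(1)}_{T,T'}$ and $e^{(2)}_{T,T'}$ as the efficiency of T
with respect to T' for the one-sample and the two-sample problems
respectively, we obtain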
(4.9)  $e^{(2)}_{\hat\Delta_N,\bar Y_N-\bar X_N} = \left[\prod_{j=1}^{p} \sigma_j^2c_j^2/d_{jj}\right]^{1/p}\left[\det\|\rho_{jk}\|/\det\|\rho^J_{jk}\|\right]^{1/p}$

and expressions of a similar type for the remaining efficiencies. Our main
interest is to study the relative asymptotic performances of
$\hat\Delta_{N(\Phi)}$, $\hat\Delta_{N(R)}$, $\hat\Delta_{N(M)}$ and
$\bar Y_N-\bar X_N$ in the two-sample case and those of
$\hat\theta_{N(\Phi)}$, $\hat\theta_{N(R)}$, $\hat\theta_{N(M)}$ and
$\bar Z_N$ in the corresponding one-sample case. [At this stage we would
also like to draw the attention of the reader to a paper of Bickel [2], who
earlier proposed the estimate $\hat\theta_{N(R)}$ of $\theta$ and investigated
its asymptotic performance with respect to the estimates
$\hat\theta_{N(M)}$ and $\bar Z_N$. Since the estimate $\hat\theta_{N(R)}$ is
only a particular member of a relatively more general class of our
estimators, some of our results duplicate his results. However, this
duplication is retained here in order to provide a comprehensive comparison
and for ease of reference.]

Thus applying the definition (4.1), we have


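(4.10)  $e^{(2)}_{\hat\Delta_{N(\Phi)},\bar Y_N-\bar X_N} = e^{(1)}_{\hat\theta_{N(\Phi)},\bar Z_N} = \left[\prod_{j=1}^{p}\sigma_j^2B_j^2(F,\Phi)\right]^{1/p}\left[\det\|\rho_{jk}\|/\det\|\rho^\Phi_{jk}\|\right]^{1/p}$

where $B_j(F,\Phi)$ is given by (3.13) and

(4.11)  $\rho^\Phi_{jk} = 1$  if j=k,
        $\rho^\Phi_{jk} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \Phi^{-1}[F_j(x)]\Phi^{-1}[F_k(y)]\,dF_{j,k}(x,y)$  if $j \neq k$,

(4.12)  $e^{(2)}_{\hat\Delta_{N(R)},\bar Y_N-\bar X_N} = e^{(1)}_{\hat\theta_{N(R)},\bar Z_N} = \left[\prod_{j=1}^{p} 12\sigma_j^2B_j^2(F,R)\right]^{1/p}\left[\det\|\rho_{jk}\|/\det\|\rho^R_{jk}\|\right]^{1/p}$

where $B_j(F,R)$ is given by (3.16) and

(4.13)  $\rho^R_{jk} = 1$  if j=k,
        $\rho^R_{jk} = 12\left[\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} F_j(x)F_k(y)\,dF_{j,k}(x,y) - \frac14\right]$  if $j \neq k$,

(4.14)  $e^{(2)}_{\hat\Delta_{N(\Phi)},\hat\Delta_{N(R)}} = e^{(1)}_{\hat\theta_{N(\Phi)},\hat\theta_{N(R)}} = \left[\prod_{j=1}^{p} B_j^2(F,\Phi)/12B_j^2(F,R)\right]^{1/p}\left[\det\|\rho^R_{jk}\|/\det\|\rho^\Phi_{jk}\|\right]^{1/p}$,

(4.15)  $e^{(2)}_{\hat\Delta_{N(\Phi)},\hat\Delta_{N(M)}} = e^{(1)}_{\hat\theta_{N(\Phi)},\hat\theta_{N(M)}} = \left[\prod_{j=1}^{p} B_j^2(F,\Phi)/4f_j^2(0)\right]^{1/p}\left[\det\|\rho^M_{jk}\|/\det\|\rho^\Phi_{jk}\|\right]^{1/p}$

where $\rho^M_{jk}$ is given by (4.7).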

For the case of p=1, the above efficiencies take the familiar expressions
for which the minimum bounds are known to exist in the literature (cf.
[5,7,8,11]). However, for general multivariate problems, it is not possible
to find the bounds of the above efficiency expressions for arbitrary F(x),
as they depend upon the association patterns of the p variates in F(x).
However, a few special cases may be considered as follows:

Case 1. Totally Symmetric Case. A bivariate random vector (X,Y) is said to
be totally symmetric if (X,Y), (X,-Y) and (-X,-Y) have the same distribution
function. A sufficient condition for the asymptotic independence of the
components of $\hat\theta_{N(\Phi)}$, $\hat\theta_{N(R)}$,
$\hat\theta_{N(M)}$ and $\bar Z_N$ (or of the corresponding two sample
estimators) is the total symmetry of $(Z_{j\alpha},Z_{k\alpha})$ (or
$(X_{j\alpha},X_{k\alpha})$) for all $j \neq k$ ([13, 18]). In this case the
covariance matrices reduce to diagonal matrices, and we have

(4.16)  $e^{(2)}_{\hat\Delta_{N(R)},\bar Y_N-\bar X_N} = e^{(1)}_{\hat\theta_{N(R)},\bar Z_N} = \left[\prod_{j=1}^{p} 12\sigma_j^2B_j^2(F,R)\right]^{1/p}$,

(4.17)  $e^{(2)}_{\hat\Delta_{N(\Phi)},\bar Y_N-\bar X_N} = e^{(1)}_{\hat\theta_{N(\Phi)},\bar Z_N} = \left[\prod_{j=1}^{p} \sigma_j^2B_j^2(F,\Phi)\right]^{1/p}$,

(4.18)  $e^{(2)}_{\hat\Delta_{N(\Phi)},\hat\Delta_{N(R)}} = e^{(1)}_{\hat\theta_{N(\Phi)},\hat\theta_{N(R)}} = \left[\frac{1}{12^p}\prod_{j=1}^{p} B_j^2(F,\Phi)B_j^{-2}(F,R)\right]^{1/p}$,

(4.19)  $e^{(2)}_{\hat\Delta_{N(M)},\bar Y_N-\bar X_N} = e^{(1)}_{\hat\theta_{N(M)},\bar Z_N} = 4\left[\prod_{j=1}^{p} \sigma_j^2f_j^2(0)\right]^{1/p}$.

Now since (from the results of the univariate theory; see, for example, [3],
[8] and [11])

$12\sigma_j^2B_j^2(F,R) \ge 0.864$  for all F,
$\sigma_j^2B_j^2(F,\Phi) \ge 1$  for all F,
$B_j^2(F,\Phi)B_j^{-2}(F,R)/12 \ge \pi/6$  for all F,

and

$4\sigma_j^2f_j^2(0) \ge 0.33$  for all unimodal F,

it follows that

(4.20)  $\inf_{F\in\mathcal{F}} e^{(1)}_{\hat\theta_{N(R)},\bar Z_N} = 0.864$,

(4.21)  $\inf_{F\in\mathcal{F}} e^{(1)}_{\hat\theta_{N(\Phi)},\bar Z_N} = 1$,

(4.22)  $\inf_{F\in\mathcal{F}} e^{(1)}_{\hat\theta_{N(\Phi)},\hat\theta_{N(R)}} = \pi/6$,

(4.23)  $\inf_{F\in\mathcal{F}^*} e^{(1)}_{\hat\theta_{N(M)},\bar Z_N} = 0.33$,

where $\mathcal{F}$ is the set of all totally symmetric absolutely continuous
p-variate distributions and $\mathcal{F}^*$ is the set of all unimodal totally
symmetric absolutely continuous p-variate distributions.
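As an illustration of the univariate factors entering these bounds, the sketch below evaluates $12\sigma^2B^2(F,R)$ and $4\sigma^2f^2(0)$ numerically when F is a normal distribution with standard deviation $\sigma$. The function names, the choice $\sigma = 2$ and the trapezoidal integration settings are assumptions made only for this example; for normal F the two factors are known to equal $3/\pi$ and $2/\pi$, which lie above the bounds 0.864 and 0.33.

```python
import math


def normal_pdf(x, sigma):
    """Density of the normal distribution with mean 0 and std. dev. sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def integral_f_squared(sigma, half_width=12.0, steps=20000):
    """Trapezoidal approximation of B(F, R), the integral of f(x)^2."""
    lo, hi = -half_width * sigma, half_width * sigma
    h = (hi - lo) / steps
    total = 0.5 * (normal_pdf(lo, sigma) ** 2 + normal_pdf(hi, sigma) ** 2)
    for i in range(1, steps):
        total += normal_pdf(lo + i * h, sigma) ** 2
    return total * h


sigma = 2.0  # hypothetical scale; both factors are scale free
wilcoxon_factor = 12.0 * sigma ** 2 * integral_f_squared(sigma) ** 2
median_factor = 4.0 * sigma ** 2 * normal_pdf(0.0, sigma) ** 2
print(wilcoxon_factor, median_factor)
```

Changing `sigma` leaves both factors unchanged, which reflects the scale invariance of the efficiencies.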

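Now suppose that the components of F are totally symmetric as well as
identically distributed. Then the efficiencies are independent of p, and hence
the same as in the case of one-sample univariate situations.

Case 2. Equally Correlated Distributions. Let us now assume that the
distribution of $(Z_{j\alpha},Z_{k\alpha})$ (or $(X_{j\alpha},X_{k\alpha})$)
is independent of j and k when $\theta = 0$ (or $\Delta = 0$), so that
$\rho_{jk} = \rho$, $\rho^R_{jk} = \rho^R$, $\rho^\Phi_{jk} = \rho^\Phi$ and
$\rho^M_{jk} = \rho^M$ for all $j \neq k$, and the marginal quantities
$\sigma_j$, $B_j$ and $f_j(0)$ do not depend on j. Since the determinant of a
p x p matrix with unit diagonal and all off-diagonal elements equal to $\rho$
is $(1-\rho)^{p-1}[1+(p-1)\rho]$, it turns out that

(4.24)  $e^{(1)}_{\hat\theta_{N(R)},\bar Z_N} = e^{(2)}_{\hat\Delta_{N(R)},\bar Y_N-\bar X_N} = 12\sigma^2B^2(F,R)\left[\frac{1+(p-1)\rho}{1+(p-1)\rho^R}\right]^{1/p}\left[\frac{1-\rho}{1-\rho^R}\right]^{(p-1)/p}$,

(4.25)  $e^{(1)}_{\hat\theta_{N(\Phi)},\bar Z_N} = e^{(2)}_{\hat\Delta_{N(\Phi)},\bar Y_N-\bar X_N} = \sigma^2B^2(F,\Phi)\left[\frac{1+(p-1)\rho}{1+(p-1)\rho^\Phi}\right]^{1/p}\left[\frac{1-\rho}{1-\rho^\Phi}\right]^{(p-1)/p}$,

(4.26)  $e^{(1)}_{\hat\theta_{N(\Phi)},\hat\theta_{N(R)}} = e^{(2)}_{\hat\Delta_{N(\Phi)},\hat\Delta_{N(R)}} = \frac{B^2(F,\Phi)B^{-2}(F,R)}{12}\left[\frac{1+(p-1)\rho^R}{1+(p-1)\rho^\Phi}\right]^{1/p}\left[\frac{1-\rho^R}{1-\rho^\Phi}\right]^{(p-1)/p}$,

(4.27)  $e^{(1)}_{\hat\theta_{N(M)},\bar Z_N} = e^{(2)}_{\hat\Delta_{N(M)},\bar Y_N-\bar X_N} = 4\sigma^2f^2(0)\left[\frac{1+(p-1)\rho}{1+(p-1)\rho^M}\right]^{1/p}\left[\frac{1-\rho}{1-\rho^M}\right]^{(p-1)/p}$.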
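The equally correlated case rests on the closed form for the determinant of a matrix with unit diagonal and a common off-diagonal element. The sketch below, with hypothetical function names, evaluates that determinant and an efficiency of the form "univariate factor times the p-th root of a determinant ratio". The numerical illustration assumes a trivariate normal F with common correlation 0.5, for which the univariate Wilcoxon factor is $3/\pi$ and the grade correlation is $(6/\pi)\sin^{-1}(\rho/2)$, both standard facts for the normal distribution.

```python
import math


def equicorrelation_det(p, rho):
    """Determinant of the p x p matrix with 1 on the diagonal, rho elsewhere."""
    return (1.0 - rho) ** (p - 1) * (1.0 + (p - 1) * rho)


def equicorrelated_efficiency(univariate_factor, p, rho_num, rho_den):
    """univariate_factor * [det R(rho_num) / det R(rho_den)] ** (1/p)."""
    ratio = equicorrelation_det(p, rho_num) / equicorrelation_det(p, rho_den)
    return univariate_factor * ratio ** (1.0 / p)


# Hypothetical illustration: trivariate normal F with common correlation 0.5.
p, rho = 3, 0.5
rho_r = (6.0 / math.pi) * math.asin(rho / 2.0)  # grade correlation under normality
eff_wilcoxon_vs_mean = equicorrelated_efficiency(3.0 / math.pi, p, rho, rho_r)
print(equicorrelation_det(p, rho), eff_wilcoxon_vs_mean)
```

For p = 3 the closed form can be checked against the direct expansion $1 + 2\rho^3 - 3\rho^2$ of the 3 x 3 determinant.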


In particular, if $X_{j\alpha}$ (or $Z_{j\alpha}$) $= U_{j+1}-U_1$, where
$U_1,\ldots,U_{p+1}$ are independent and identically distributed symmetric
random variables, then from [2], one obtains


(A. 28) 


,(1) 


=  c 


.(2) 


-1 


which  approaches  3o  f  (o)  as  p-^^* 


p-i-2    i  i 


,2r,Z, 


(A.  29)   6a^B^(F.R)(pH-l) 


P"'  <  e  ^^> 


.(2) 


;  ^-^^       -     =  c-^  '^   -  -   <  12a^B^(F  R) 
'^N(R)''^N    '^N(R)'^n"'^  ~     11' 


If $U_1$ is normal or rectangular, then $e_{\hat\theta_{N(R)},\bar Z_N}$ is
bounded below by .925 and .88 respectively for all p. Finally


(A. 30) 


and 


(4.31) 


..CD   ^     _  ,J2)         ^ 

'^N(<5>)"^N(R)   ^:.-(<;>)'^x(R) 


B^(F,<:))B^^F,R) 
12 


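Case 3. Normal Case. Let the underlying distribution function F be a
nonsingular p-variate normal distribution function with mean vector zero and
covariance matrix $\Sigma = (\rho_{jk}\sigma_j\sigma_k)$. Then, from (4.10),
(4.12), (4.14) and (4.15) we obtain

(4.32)  $e^{(1)}_{\hat\theta_{N(\Phi)},\bar Z_N} = e^{(2)}_{\hat\Delta_{N(\Phi)},\bar Y_N-\bar X_N} = 1$,

(4.33)  $e^{(1)}_{\hat\theta_{N(R)},\bar Z_N} = e^{(2)}_{\hat\Delta_{N(R)},\bar Y_N-\bar X_N} = \frac{3}{\pi}\left[\det\|\rho_{jk}\|/\det\|\rho^R_{jk}\|\right]^{1/p}$

where

(4.34)  $\rho^R_{jk} = \frac{6}{\pi}\sin^{-1}(\rho_{jk}/2)$,

(4.35)  $e^{(1)}_{\hat\theta_{N(\Phi)},\hat\theta_{N(R)}} = e^{(2)}_{\hat\Delta_{N(\Phi)},\hat\Delta_{N(R)}} = \frac{\pi}{3}\left[\det\|\rho^R_{jk}\|/\det\|\rho_{jk}\|\right]^{1/p}$

where $\rho^R_{jk}$ is defined by (4.34).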

(4.36) 


4''   i  -  =/"  s 

^N(R)'^    %(R)"^M 


2 

2 


II  M 


de.  llP.f' 


-1 


From (4.32), we observe that in the case of the normal distribution, the
property of the univariate normal scores estimator relative to the least
squares estimator is preserved in the multivariate case. This is interesting
in the sense that the same is not true of the Wilcoxon estimator, as Bickel
[2] has proved that when the underlying distribution is trivariate nonsingular

-h 

normal  with  p..=0  for  i,j  =  l,2,  p..  =  (l-a)2   for  i?^  j  ,  p.  .  =  1  for  i= j ,  the  ex- 
pression (4.33)  can  be  arbitrarily  small.   In  fact,  for  p>2 


(4.37) 


inf   (1)   _    inf   (2)     _ 
^^*'4(R)'^N^^^'^\(R)'V^" 


where  <J>  is  the  set  of  all  nonsingular  p-variate  normal  distributions  [2],  and, 
since 


(4.38) 


it  follows  that 


inf   (1)   _    inf    (2)  _  _    i 


(4.39) 


S'JP  „.(1) 


Fe 


s"P  e.(2) 


*  'Sn(<I.)"^N(R)    ^^'^     ^(<I.)'^N(R) 
Thus we find that when the underlying distribution function is normal,
the multivariate normal scores estimator $\hat\Delta_{N(\Phi)}$ or
$\hat\theta_{N(\Phi)}$ (which is as good asymptotically as the means
estimator) can be infinitely better than the multivariate Wilcoxon type
estimator $\hat\Delta_{N(R)}$ or $\hat\theta_{N(R)}$ for p>2. This leads to
the question of examining the relative performance of these two estimators,
viz. $\hat\Delta_{N(\Phi)}$ (or $\hat\theta_{N(\Phi)}$) and
$\hat\Delta_{N(R)}$ (or $\hat\theta_{N(R)}$), for the case when the underlying
distribution function F is bivariate normal
$N(0,\sigma_1^2,\sigma_2^2,\rho)$. The efficiency behavior of
$\hat\Delta_{N(\Phi)}$ and $\hat\Delta_{N(R)}$ (or $\hat\theta_{N(\Phi)}$ and
$\hat\theta_{N(R)}$) is given by the following.


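Theorem 4.1. The efficiency of $\hat\theta_{N(R)}$ (or $\hat\Delta_{N(R)}$)
with respect to $\hat\theta_{N(\Phi)}$ (or $\hat\Delta_{N(\Phi)}$) is
independent of $\sigma_1$ and $\sigma_2$ and is given by

(4.40)  $e_{\hat\theta_{N(R)},\hat\theta_{N(\Phi)}} = \frac{3}{\pi}\left[\frac{1-\rho^2}{1-9\left(1-\frac{2}{\pi}\cos^{-1}\frac{\rho}{2}\right)^2}\right]^{1/2} = \frac{3}{2}\left[\frac{-\left(1+2\cos\frac{2}{3}(\pi+u)\right)}{u(\pi-u)}\right]^{1/2}$

where u is determined by $\rho = 2\cos[(u+\pi)/3]$. The function
$e_{\hat\theta_{N(R)},\hat\theta_{N(\Phi)}}$ is (1) monotone decreasing for
$0 \le \rho < 1$ and (2) symmetric about $\rho = 0$ and hence unimodal.
Finally,

(4.41)  $\lim_{\rho\to\pm1} e_{\hat\theta_{N(R)},\hat\theta_{N(\Phi)}} = \left(\frac{3\sqrt3}{2\pi}\right)^{1/2} = 0.91$.

The proof is the same as in [2] (see Theorem 3.2 [2]) since when the
underlying distribution is normal
$e_{\hat\theta_{N(R)},\hat\theta_{N(\Phi)}} = e_{\hat\theta_{N(R)},\bar Z_N}$.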
Thus when the underlying distribution is bivariate normal,
$e_{\hat\theta_{N(R)},\hat\theta_{N(\Phi)}}$, which is the same as
$e_{\hat\Delta_{N(R)},\hat\Delta_{N(\Phi)}}$, depends only on $\rho$ and has a
unique maximum when $\rho = 0$. As in the univariate case,
$\hat\theta_{N(\Phi)}$ is better than $\hat\theta_{N(R)}$. In a similar way
the efficiency behavior of $\hat\theta_{N(\Phi)}$ relative to
$\hat\theta_{N(M)}$, which is the same as that of $\hat\Delta_{N(\Phi)}$
relative to $\hat\Delta_{N(M)}$, may be obtained. Here again, as in the
univariate case, it can be shown that $\hat\theta_{N(\Phi)}$ is appreciably
superior to $\hat\theta_{N(M)}$. The details are omitted.
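The behavior of the bivariate normal efficiency of the Wilcoxon estimator relative to the normal scores estimator can be explored numerically. The sketch below assumes the form $\frac{3}{\pi}[(1-\rho^2)/(1-\rho_R^2)]^{1/2}$ with $\rho_R = (6/\pi)\sin^{-1}(\rho/2)$, which follows from the Wilcoxon-versus-mean efficiency for p=2 when the normal scores estimator is as efficient as the mean; the function name and the grid are hypothetical.

```python
import math


def wilcoxon_vs_normal_scores(rho):
    """Bivariate normal efficiency of the Wilcoxon estimator, for |rho| < 1."""
    rho_r = (6.0 / math.pi) * math.asin(rho / 2.0)  # grade correlation
    return (3.0 / math.pi) * math.sqrt((1.0 - rho ** 2) / (1.0 - rho_r ** 2))


grid = [i / 100 for i in range(0, 100)]          # rho = 0.00, 0.01, ..., 0.99
values = [wilcoxon_vs_normal_scores(r) for r in grid]
limit = math.sqrt(3.0 * math.sqrt(3.0) / (2.0 * math.pi))  # value as |rho| -> 1

print(values[0], values[50], values[-1], limit)
```

On this grid the efficiency starts at $3/\pi$ for $\rho = 0$ and decreases toward a limit of about 0.91 as $\rho$ approaches 1, so the maximum is attained at $\rho = 0$.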
Appendix: An example of computation of $\hat\theta_{N(\Phi)}$.

To 6 patients suffering from hypertension, a drug was applied and the
following characteristics were recorded: X (decrease in blood pressure),
Y (increase in the average number of hours of sleep per day) and Z (decrease
in body-weight), all over a month under treatment.

Patient                              X      Y       Z (lbs)
1                                    21     1.3     1.8
2                                    27     0.7     3.0
3                                    18     1.9     0.9
4                                    18     1.3     1.5
5                                    15     0.7     1.8
6                                    -3     0.2     0.5
Average (least squares estimates)    16     1.017   1.583

It is easy to see that the median estimator $\hat\theta_{N(M)}$ is
(18, 1.00, 1.65). Also, if we compute the median of the 21 possible mid-ranges
for each column, we readily arrive at the Wilcoxon-scores estimate
$\hat\theta_{N(R)}$ = (18, 1.00, 1.65), which incidentally agrees with
$\hat\theta_{N(M)}$. To compute the normal scores estimate, we employ the
following iterative procedure. The expected values $E_{6,\alpha}$,
$\alpha=1,\ldots,6$, of the order statistics of a sample of size 6 from a
chi-distribution with one degree of freedom are as follows:

$\alpha$          1       2       3       4       5       6
$E_{6,\alpha}$    0.183   0.377   0.589   0.835   1.149   1.654

To compute the estimate of $\theta_1$, we start with the trial value 18 (which
is the Wilcoxon-scores estimator). The residuals X-18 are

3, 9, 0, 0, -3, -21.

To handle tied observations, we adopt the convention that the total scores
for the tied ranks should be equally divided among them. Thus, the value
of $h^*_{N1}$, defined in (2.3), for $\theta_1$ = 18.0 is 0.4035, which is
greater than $\mu^*_{N1}$ = 0.399. If we work with 18+, the value of
$h^*_{N1}$ drops to 0.290, which is less than 0.399. Consequently, the
estimate is obtained as 18. In a similar way, the normal scores estimates of
$\theta_2$ and $\theta_3$ can be obtained as 1.00 and 1.65 respectively. Thus,
in this example, all the three estimates $\hat\theta_{N(M)}$,
$\hat\theta_{N(R)}$ and $\hat\theta_{N(\Phi)}$ coincide. In general, for small
samples, they are very close to each other, and hence $\hat\theta_{N(R)}$ may
be used as the starting value in the iterative procedure for computing
$\hat\theta_{N(\Phi)}$; the process should converge very rapidly. The
two-sample situation can also be treated similarly. For brevity, the details
are omitted.
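The trial step for the first coordinate can be replayed with a short Python sketch. It rests on assumptions that are not spelled out in the text: the statistic is taken as (1/N) times the sum of the scores attached to positive residuals, tied absolute values share the average of the scores of the ranks they occupy, and a residual exactly equal to zero is counted as positive. These choices are made because they reproduce the quoted values 0.4035 and 0.399; the function name is hypothetical.

```python
# Expected chi(1) order statistics for N = 6, as tabulated in the appendix.
SCORES = [0.183, 0.377, 0.589, 0.835, 1.149, 1.654]


def signed_score_statistic(residuals, scores=SCORES):
    """(1/N) * sum of scores of nonnegative residuals, ranked by |residual|.

    Tied absolute values share the average of the scores of their ranks;
    counting a zero residual as positive is an assumption of this sketch.
    """
    n = len(residuals)
    order = sorted(range(n), key=lambda i: abs(residuals[i]))
    total = 0.0
    start = 0
    while start < n:
        end = start
        while (end + 1 < n and
               abs(residuals[order[end + 1]]) == abs(residuals[order[start]])):
            end += 1
        shared = sum(scores[start:end + 1]) / (end - start + 1)
        positives = sum(1 for k in range(start, end + 1)
                        if residuals[order[k]] >= 0)
        total += shared * positives
        start = end + 1
    return total / n


x = [21, 27, 18, 18, 15, -3]                 # blood pressure column
mu_star = sum(SCORES) / (2 * len(SCORES))    # centre of symmetry under the null
h_at_18 = signed_score_statistic([v - 18 for v in x])
h_just_above = signed_score_statistic([v - 18.000001 for v in x])
print(round(mu_star, 3), round(h_at_18, 4), round(h_just_above, 3))
```

The statistic lies above the centre of symmetry at the trial value 18 and below it for a trial value just above 18, which is the sign change that locates the estimate at 18.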